Learning semantic sentence representations from visually grounded language without lexical knowledge
نویسندگان
چکیده
منابع مشابه
Learning Visually Grounded Sentence Representations
We introduce a variety of models, trained on a supervised image captioning corpus to predict the image features for a given caption, to perform sentence representation grounding. We train a grounded sentence encoder that achieves good performance on COCO caption and image retrieval and subsequently show that this encoder can successfully be transferred to various NLP tasks, with improved perfor...
متن کاملImproving Visually Grounded Sentence Representations with Self-Attention
Sentence representation models trained only on language could potentially suffer from the grounding problem. Recent work has shown promising results in improving the qualities of sentence representations by jointly training them with associated image features. However, the grounding capability is limited due to distant connection between input sentences and image features by the design of the a...
متن کاملLearning Visually Grounded Words and Syntax of Natural Spoken Language
Properties of the physical world have shaped human evolutionary design and given rise to physically grounded mental representations. These grounded representations provide the foundation for higher level cognitive processes including language. Most natural language processing machines to date lack grounding. This paper advocates the creation of physically grounded language learning machines as ...
متن کاملLearning Visually Grounded Common Sense Spatial Knowledge for Implicit Spatial Language
Spatial understanding is crucial for any agent that navigates in a physical world. Computational and cognitive frameworks often model spatial representations as spatial templates or regions of acceptability for two objects under an explicit spatial preposition such as “left” or “below” (Logan and Sadler 1996). Contrary to previous work that define spatial templates for explicit spatial language...
متن کاملVisually-Grounded Bayesian Word Learning
Learning the meaning of a novel noun from a few labeled objects is one of the simplest aspects of learning a language, but approximating human performance on this task is still a significant challenge for current machine learning systems. Current methods typically fail to find the appropriate level of generalization in a concept hierarchy for a given visual stimulus. Recent work in cognitive sc...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Natural Language Engineering
سال: 2019
ISSN: 1351-3249,1469-8110
DOI: 10.1017/s1351324919000196